43 research outputs found

    Enumerating five families of pattern-avoiding inversion sequences; and introducing the powered Catalan numbers

    The first problem addressed by this article is the enumeration of some families of pattern-avoiding inversion sequences. We solve some enumerative conjectures left open by the foundational work on the topic by Corteel et al., some of which were also solved independently by Lin, and by Kim and Lin. The strength of our approach is its robustness: we enumerate four families $F_1 \subset F_2 \subset F_3 \subset F_4$ of pattern-avoiding inversion sequences, ordered by inclusion, using the same approach. More precisely, we provide a generating tree (with associated succession rule) for each family $F_i$ which generalizes the one for the family $F_{i-1}$. The second topic of the paper is the enumeration of a fifth family $F_5$ of pattern-avoiding inversion sequences (containing $F_4$). This enumeration is also solved via a succession rule, which however does not generalize the one for $F_4$. The associated enumeration sequence, which we call the powered Catalan numbers, is quite intriguing, and is further investigated. We provide two different succession rules for it, denoted $\Omega_{pCat}$ and $\Omega_{steady}$, and show that they define two types of families enumerated by powered Catalan numbers. Among such families, we introduce the steady paths, which are naturally associated with $\Omega_{steady}$. They allow us to bridge the gap between the two types of families enumerated by powered Catalan numbers: indeed, we provide a size-preserving bijection between steady paths and valley-marked Dyck paths (which are naturally associated with $\Omega_{pCat}$). Along the way, we provide several nice connections to families of permutations defined by the avoidance of vincular patterns, and some enumerative conjectures. Comment: V2 includes modifications suggested by referees (in particular, a much shorter Section 3, to account for arXiv:1706.07213
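
As an illustration of the counting sequence mentioned in this abstract, the following minimal Python sketch counts valley-marked Dyck paths by dynamic programming. The abstract does not define these objects, so the definition used here is an assumption: a Dyck path of semilength n in which every valley (a down step immediately followed by an up step) at height h carries one of h + 1 possible marks. The function name is hypothetical.

```python
def count_valley_marked_dyck_paths(n):
    """Count Dyck paths of semilength n, weighting each valley at height h by h + 1.

    Assumed definition: a valley is a down step immediately followed by an
    up step, and a valley at height h can be marked in h + 1 ways.
    """
    if n == 0:
        return 1
    # up[h] / down[h]: weighted number of prefixes that end at height h
    # and whose last step is an up / down step.
    up = [0] * (n + 2)
    down = [0] * (n + 2)
    up[0] = 1  # empty prefix; stored in `up` so that no valley is counted at the start
    for _ in range(2 * n):
        new_up = [0] * (n + 2)
        new_down = [0] * (n + 2)
        for h in range(n + 1):
            # Up step from height h: if the previous step was a down step,
            # this closes a valley at height h, which has h + 1 possible marks.
            new_up[h + 1] = up[h] + (h + 1) * down[h]
            # Down step from height h (only possible above the x-axis).
            if h >= 1:
                new_down[h - 1] = up[h] + down[h]
        up, down = new_up, new_down
    return down[0]


print([count_valley_marked_dyck_paths(n) for n in range(1, 6)])
```

Under the assumed definition, the first values are 1, 2, 6, 23, 105, which can be checked by hand for small n.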

    Lossy Compressor preserving variant calling through Extended BWT

    A standard format used for storing the output of high-throughput sequencing experiments is the FASTQ format. It comprises three main components: (i) headers, (ii) bases (nucleotide sequences), and (iii) quality scores. FASTQ files are widely used for variant calling, where sequencing data are mapped to a reference genome to discover variants that may be used for further analysis. There are many specialized compressors that exploit redundancy in FASTQ data, focusing on only the bases or only the quality scores. In this paper we consider the novel problem of lossy compression of FASTQ data, in a reference-free way, by modifying both components at the same time while preserving the important information of the original FASTQ. We introduce a general strategy, based on the Extended Burrows-Wheeler Transform (EBWT) and positional clustering, and we present implementations in both internal memory and external memory. Experimental results show that the lossy compression performed by our tool achieves good compression while preserving information relating to variant calling better than its competitors. Availability: the software is freely available at https://github.com/veronicaguerrini/BFQzip. Comment: Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologie
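
The three FASTQ components named in this abstract can be made concrete with a small reader. This is only a sketch, not part of BFQzip: it assumes the common four-line record layout (header line starting with `@`, bases, a `+` separator line, quality string) and the Phred+33 quality encoding. `FastqRecord` and `parse_fastq` are hypothetical names.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    header: str            # text after the leading '@'
    bases: str             # nucleotide sequence
    qualities: List[int]   # decoded Phred scores (Phred+33 assumed)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text (assumed layout, see above)."""
    it = iter(lines)
    for raw_header in it:
        header = raw_header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        rest = [next(it, None) for _ in range(3)]
        if None in rest:
            raise ValueError("truncated FASTQ record")
        bases, plus, quality = (line.rstrip("\n") for line in rest)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(bases) != len(quality):
            raise ValueError("bases and quality string differ in length")
        # Phred+33: quality score = ASCII code of the character minus 33.
        yield FastqRecord(header[1:], bases, [ord(c) - 33 for c in quality])


records = list(parse_fastq(["@read1\n", "ACGT\n", "+\n", "!I5#\n"]))
print(records[0])
```

A lossy compressor of the kind described would modify the `bases` and `qualities` fields; how it chooses which positions to change is not shown here.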

    phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering

    Molecular phylogenetics is a fundamental branch of biology. It studies the evolutionary relationships among the individuals of a population through their biological sequences, and may provide insights about the origin and the evolution of viral diseases, or highlight complex evolutionary trajectories. In this paper we develop a method called phyBWT, describing how to use the extended Burrows-Wheeler Transform (eBWT) for a collection of DNA sequences to directly reconstruct phylogeny, bypassing the alignment against a reference genome or de novo assembly. Our phyBWT hinges on the combinatorial properties of the eBWT positional clustering framework. We employ eBWT to detect relevant blocks of the longest shared substrings of varying length (unlike the k-mer-based approaches that need to fix the length k a priori), and build a suitable decomposition leading to a phylogenetic tree, step by step. As a result, phyBWT is a new alignment-, assembly-, and reference-free method that builds a partition tree without relying on the pairwise comparison of sequences, thus avoiding the use of a distance matrix to infer phylogeny. The preliminary experimental results on sequencing data show that our method can handle datasets of different types (short reads, contigs, or entire genomes), producing trees of quality comparable to that found in the benchmark phylogeny
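
To make the transform named in this abstract concrete, here is a naive Python sketch of the extended BWT of a small string collection. It follows the textbook definition (sort all cyclic rotations of all strings by comparing their infinite repetitions, then read off the last characters) and is not the phyBWT implementation, which is not described in the abstract. The function name `ebwt` is hypothetical, and the quadratic construction is only suitable for toy inputs.

```python
def ebwt(strings):
    """Naive extended BWT of a collection of non-empty strings.

    Rotations are ordered by comparing their infinite repetitions. Comparing
    prefixes of length 2 * max_len is enough for this: two infinite
    repetitions that agree on |u| + |v| characters are identical.
    """
    max_len = max(len(s) for s in strings)
    rotations = []
    for s in strings:
        for i in range(len(s)):
            rot = s[i:] + s[:i]
            # Repeat the rotation until it covers 2 * max_len characters.
            repeats = (2 * max_len) // len(rot) + 1
            key = (rot * repeats)[: 2 * max_len]
            rotations.append((key, rot[-1]))
    rotations.sort(key=lambda pair: pair[0])
    # The transform is the sequence of last characters in sorted order.
    return "".join(last for _, last in rotations)


print(ebwt(["banana"]))   # single string: the classical rotation-based BWT
print(ebwt(["ab", "a"]))
```

Characters that precede the same shared substring in different input sequences end up adjacent in the output, which is the property that positional clustering methods build on.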

    Myelodysplastic syndromes: advantages of a combined cytogenetic and molecular diagnostic workup

    In this study we present a new diagnostic workup for the myelodysplastic syndromes (MDS) including FISH, aCGH, and somatic mutation assays in addition to conventional cytogenetics (CC). We analyzed 61 patients by CC, FISH for chromosome 5, 7, 8 and PDGFR rearrangements, aCGH, and PCR for ASXL1, EZH2, TP53, TET2, RUNX1, DNMT3A, SF3B1 somatic mutations. Moreover, we quantified WT1 and RPS14 gene expression levels, in order to assess their possible adjunctive value and clinical impact. CC analysis showed 32% of patients with at least one aberration. FISH analysis detected chromosomal aberrations in 24% of patients and recovered 5 cases (13.5%) with normal karyotype (two 5q- syndromes, one del(7) case, two cases with PDGFR rearrangement). aCGH detected 10 "new" unbalanced cases with respect to CC, including one with alteration of the ETV6 gene. After mutational analysis, 33 patients (54%) presented at least one mutation, and mutations represented the only marker of clonality in 36% of all patients. The statistical analysis confirmed the prognostic role of CC for both overall and progression-free survival. In addition, deletions detected by aCGH and WT1 over-expression negatively conditioned survival. In conclusion, our work showed that 1) the addition of FISH (at least for chr. 5 and 7) can improve the definition of the risk score; 2) mutational analysis, especially for TP53 and SF3B1, could better define the type of MDS and represent a "clinical warning"; 3) aCGH could probably be applied to selected cases (with suboptimal response or failure)

    RANK-Dependent Autosomal Recessive Osteopetrosis: Characterization of Five New Cases With Novel Mutations

    Autosomal recessive osteopetrosis (ARO) is a genetically heterogeneous disorder attributed to reduced bone resorption by osteoclasts. Most human AROs are classified as osteoclast rich, but recently two subsets of osteoclast-poor ARO have been recognized as caused by defects in either TNFSF11 or TNFRSF11A genes, coding the RANKL and RANK proteins, respectively. The RANKL/RANK axis drives osteoclast differentiation and also plays a role in the immune system. In fact, we have recently reported that mutations in the TNFRSF11A gene lead to osteoclast-poor osteopetrosis associated with hypogammaglobulinemia. Here we present the characterization of five additional unpublished patients from four unrelated families in which we found five novel mutations in the TNFRSF11A gene, including two missense and two nonsense mutations and a single-nucleotide insertion. Immunological investigation in three of them showed that the previously described defect in the B cell compartment was present only in some patients and that its severity seemed to increase with age and the progression of the disease. HSCT performed in all five patients almost completely cured the disease even when carried out in late infancy. Hypercalcemia was the most important posttransplant complication. Overall, our results further underline the heterogeneity of human ARO also deriving from the interplay between bone and the immune system, and highlight the prognostic and therapeutic implications of the molecular diagnosis. © 2012 American Society for Bone and Mineral Researc

    Human neutralizing antibodies to cold linear epitopes and subdomain 1 of the SARS-CoV-2 spike glycoprotein

    Emergence of SARS-CoV-2 variants diminishes the efficacy of vaccines and antiviral monoclonal antibodies. Continued development of immunotherapies and vaccine immunogens resilient to viral evolution is therefore necessary. Using coldspot-guided antibody discovery, a screening approach that focuses on portions of the virus spike glycoprotein that are both functionally relevant and averse to change, we identified human neutralizing antibodies to highly conserved viral epitopes. Antibody fp.006 binds the fusion peptide and cross-reacts against coronaviruses of the four genera, including the nine human coronaviruses, through recognition of a conserved motif that includes the S2´ site of proteolytic cleavage. Antibody hr2.016 targets the stem helix and neutralizes SARS-CoV-2 variants. Antibody sd1.040 binds to subdomain 1, synergizes with antibody rbd.042 for neutralization and, like fp.006 and hr2.016, protects mice expressing human ACE2 against infection when present as bispecific antibody. Thus, coldspot-guided antibody discovery reveals donor-derived neutralizing antibodies that are cross-reactive with Orthocoronavirinae, including SARS-CoV-2 variants

    Colorectal Cancer Stage at Diagnosis Before vs During the COVID-19 Pandemic in Italy

    IMPORTANCE Delays in screening programs and the reluctance of patients to seek medical attention because of the outbreak of SARS-CoV-2 could be associated with the risk of more advanced colorectal cancers at diagnosis. OBJECTIVE To evaluate whether the SARS-CoV-2 pandemic was associated with more advanced oncologic stage and change in clinical presentation for patients with colorectal cancer. DESIGN, SETTING, AND PARTICIPANTS This retrospective, multicenter cohort study included all 17 938 adult patients who underwent surgery for colorectal cancer from March 1, 2020, to December 31, 2021 (pandemic period), and from January 1, 2018, to February 29, 2020 (prepandemic period), in 81 participating centers in Italy, including tertiary centers and community hospitals. Follow-up was 30 days from surgery. EXPOSURES Any type of surgical procedure for colorectal cancer, including explorative surgery, palliative procedures, and atypical or segmental resections. MAIN OUTCOMES AND MEASURES The primary outcome was advanced stage of colorectal cancer at diagnosis. Secondary outcomes were distant metastasis, T4 stage, aggressive biology (defined as cancer with at least 1 of the following characteristics: signet ring cells, mucinous tumor, budding, lymphovascular invasion, perineural invasion, and lymphangitis), stenotic lesion, emergency surgery, and palliative surgery. The independent association between the pandemic period and the outcomes was assessed using multivariate random-effects logistic regression, with hospital as the cluster variable. RESULTS A total of 17 938 patients (10 007 men [55.8%]; mean [SD] age, 70.6 [12.2] years) underwent surgery for colorectal cancer: 7796 (43.5%) during the pandemic period and 10 142 (56.5%) during the prepandemic period. 
    Logistic regression indicated that the pandemic period was significantly associated with an increased rate of advanced-stage colorectal cancer (odds ratio [OR], 1.07; 95% CI, 1.01-1.13; P = .03), aggressive biology (OR, 1.32; 95% CI, 1.15-1.53; P < .001), and stenotic lesions (OR, 1.15; 95% CI, 1.01-1.31; P = .03). CONCLUSIONS AND RELEVANCE This cohort study suggests a significant association between the SARS-CoV-2 pandemic and the risk of a more advanced oncologic stage at diagnosis among patients undergoing surgery for colorectal cancer and might indicate a potential reduction of survival for these patients

    SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues

    Get PDF
    Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1-43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types
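
The abstract does not say how its 99% credible SNP sets were computed. A common construction, sketched below in Python as an assumption rather than the authors' method, ranks the SNPs in a locus by posterior probability of being causal and keeps the smallest top-ranked prefix whose probabilities sum to at least the target coverage. The function name, SNP identifiers, and probabilities are all hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest top-ranked set of SNPs reaching the given coverage.

    `posteriors` maps SNP identifiers to posterior probabilities of being
    causal, assumed to sum to (about) 1 within the locus.
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        total += prob
        if total >= coverage:
            break
    return chosen


# Hypothetical locus: three SNPs are needed to reach 99% coverage.
locus = {"snp_a": 0.90, "snp_b": 0.08, "snp_c": 0.015, "snp_d": 0.005}
print(credible_set(locus))
```

A locus in which a single SNP already carries at least 99% of the posterior mass yields a credible set of size one, which corresponds to the "one single likely causal SNP" case mentioned in the abstract.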